Biomarkers for predicting response of esophageal cancer patient to chemoradiotherapy

ABSTRACT

The present invention relates to novel genetic markers associated with response of a patient with esophageal cancer (ECa) to chemoradiation therapy, and particularly to methods and kits for predicting an ECa patient's response to chemoradiation therapy by genotyping of the markers.

FIELD OF THE INVENTION

The present invention generally relates to genetic markers, and more particularly to single nucleotide polymorphisms associated with the response of a patient having esophageal cancer to chemoradiotherapy.

BACKGROUND OF THE INVENTION

Esophageal cancer (ECa) has become the 6th leading cause of cancer death in the world, and its incidence rate continues to increase worldwide. Unfortunately, most patients with esophageal cancer have advanced disease at the time of initial diagnosis and are ineligible for curative surgical resection. Recently, multimodality therapies have been attempted to improve the resectability of tumors and the long-term survival of patients. Among them, concurrent chemoradiation therapy (CCRT) in a neoadjuvant setting followed by esophagogastrectomy has been widely applied in current clinical practice. However, individual variation in response to CCRT exists and is associated with different treatment outcomes. Patients with a complete response to CCRT tend to have an increased survival rate, but survival of patients without an evident response to CCRT may be compromised due to treatment-related toxicity and delays in surgical resection. Although studies have searched for biomarkers associated with patients' response to chemoradiotherapy (The pharmacogenomics journal 2009; 9:202-7; Cancer Lett 2008; 260:109-17; and Int J Cancer 2008; 123:826-30), no reliable genetic markers are currently available.

There is still a need for a genetic marker that is predictive of an ECa patient's response to chemoradiotherapy, and thus helpful in preventing unnecessary treatments and determining the most appropriate treatment for patients.

BRIEF SUMMARY OF THE INVENTION

In one aspect, the present invention provides a method of predicting response of a patient suffering from esophageal cancer to chemoradiotherapy, which comprises genotyping a test sample from the patient for a single nucleotide polymorphism (SNP) marker selected from the group consisting of rs4954256 and rs16863886 and a combination thereof, wherein the presence of a C allele in rs4954256, a G allele in rs16863886 or both is indicative of an increased likelihood of having a complete response to chemoradiotherapy.

In another aspect, the present invention provides a kit for performing the method as described herein comprising one or more isolated polynucleotides for conducting the genotyping of rs4954256, rs16863886 or a combination thereof. In one embodiment, the kit comprises a first set of isolated polynucleotides for conducting the genotyping of rs16863886. In another embodiment, the kit comprises a second set of isolated polynucleotides for conducting the genotyping of rs4954256.

The various embodiments of the present invention are described in detail below. Other characteristics of the present invention will be clearly presented by the following detailed description of the various embodiments and the claims.

It is believed that a person of ordinary skill in the art to which the present invention belongs can utilize the present invention to its broadest scope based on the description herein with no need of further illustration. Therefore, the following description should be understood as demonstrative rather than as limiting the scope of the present invention in any way.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

The foregoing summary, as well as the following detailed description of the invention, will be better understood when read in conjunction with the appended drawings. For the purpose of illustrating the invention, there are shown in the drawings embodiments which are presently preferred. It should be understood, however, that the invention is not limited to the preferred embodiments shown.

In the drawings:

FIG. 1 shows an overview of the study design for the examples below.

DETAILED DESCRIPTION OF THE INVENTION

The present invention features two SNP markers, rs4954256 and rs16863886, identified by a two-stage genome-wide association study (GWAS), which are significantly associated with a complete CCRT response of an ECa patient and provide a high level of prediction accuracy.

Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by a person skilled in the art to which this invention belongs.

The articles “a” and “an” are used herein to refer to one or more than one (i.e., at least one) of the grammatical object of the article.

As used herein, the term “polynucleotide”, “nucleic acid” or “nucleic acid molecule” refers to a polymer composed of nucleotide units, including naturally occurring nucleic acids, such as deoxyribonucleic acid (“DNA”) and ribonucleic acid (“RNA”) as well as nucleic acid analogs including those which have non-naturally occurring nucleotides. Polynucleotides can be synthesized, for example, using an automated DNA synthesizer. The term “nucleic acid” or “nucleic acid molecule” typically refers to a large polynucleotide. It will be understood that when a nucleic acid fragment is represented by a DNA sequence (i.e., A, T, G, C), this also includes an RNA sequence (i.e., A, U, G, C) in which “U” replaces “T.”

As used herein, the term “isolated” with respect to nucleic acids, such as DNA or RNA, refers to molecules separated from other DNAs or RNAs, respectively, which are present in the natural source of the macromolecule. The term “isolated” as used herein also refers to a nucleic acid that is substantially free of cellular material, viral material, or culture medium when produced by recombinant DNA techniques, or chemical precursors or other chemicals when chemically synthesized. Moreover, an “isolated nucleic acid” is meant to include nucleic acid fragments which are not naturally occurring as fragments and would not be found in the natural state.

As used herein, the term “allele” refers to variants of a nucleotide sequence. A biallelic polymorphism has two forms. Diploid organisms may be homozygous or heterozygous for an allelic form.

As used herein, the term “SNP” refers to single nucleotide polymorphisms in DNA. SNPs are usually preceded and followed by highly conserved sequences that vary in less than 1/100 or 1/1000 members of the population. An individual may be homozygous or heterozygous for an allele at each SNP position. A SNP may, in some instances, be referred to as a “cSNP” to denote that the nucleotide sequence containing the SNP is an amino acid “coding” sequence. A SNP may arise from a substitution of one nucleotide for another at the polymorphic site. Substitutions can be transitions or transversions. A transition is the replacement of one purine nucleotide by another purine nucleotide, or one pyrimidine by another pyrimidine. A transversion is the replacement of a purine by a pyrimidine, or vice versa. For example, if at a particular chromosomal location, one member of a population has an adenine (A) and another member of the population has a cytosine (C) at the same position, then this position is a SNP. Alleles for SNP markers as referred to herein are expressed by the bases A, C, G or T as they occur at the polymorphic site in the SNP assay employed.
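The transition/transversion distinction defined above can be illustrated with a short Python sketch. The function name `classify_substitution` is hypothetical and chosen only for illustration; it is not part of any genotyping software named in this description.

```python
# Purines and pyrimidines as defined in the paragraph above.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("identical bases are not a substitution")
    # Same chemical class on both sides means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# The A/C example from the text is a purine-to-pyrimidine change.
print(classify_substitution("A", "C"))
# The [C/T] polymorphism notation used for the markers is pyrimidine-to-pyrimidine.
print(classify_substitution("C", "T"))
```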

The nomenclature of SNPs as described herein refers to the official Reference SNP (rs) ID identification tag as assigned to each unique SNP by the National Center for Biotechnology Information (NCBI). The database is accessible to the public at www.ncbi.nlm.nih.gov/SNP/index.html.

As used herein, the term “genotype” means the identification of the alleles present in an individual or a sample. The term “genotyping” a sample or an individual for a genetic marker may comprise determination of which allele or alleles an individual carries for one or more SNPs. For example, a particular nucleotide in a genome may be an A in some individuals and a C in other individuals. Those individuals who have an A at the position have the A allele and those who have a C have the C allele. In a diploid organism the individual will have two copies of the sequence containing the polymorphic position. So the individual may have an A allele and a C allele, or alternatively two copies of the A allele, or two copies of the C allele. Each allele may be present at a different frequency in a given population. Those individuals who have two copies of the C allele are homozygous for the C allele and the genotype is CC, those individuals who have two copies of the A allele are homozygous for the A allele and the genotype is AA, and those individuals who have one copy of each allele are heterozygous and the genotype is AC.
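As an illustration of the genotype terminology above, the following Python sketch labels a diploid genotype from its two alleles and counts copies of a given allele. All function names are hypothetical helpers introduced for this example only.

```python
def genotype_label(allele_1: str, allele_2: str) -> str:
    """Combine two alleles into a genotype label such as 'AA', 'AC' or 'CC'."""
    return "".join(sorted((allele_1.upper(), allele_2.upper())))


def zygosity(genotype: str) -> str:
    """Report whether a two-letter genotype is homozygous or heterozygous."""
    if len(genotype) != 2:
        raise ValueError("a diploid genotype has exactly two alleles")
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"


def allele_count(genotype: str, allele: str) -> int:
    """Count the copies (0, 1 or 2) of one allele in a genotype."""
    return genotype.upper().count(allele.upper())


for pair in (("A", "A"), ("C", "A"), ("C", "C")):
    label = genotype_label(*pair)
    print(label, zygosity(label), "copies of C:", allele_count(label, "C"))
```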

As used herein, the terms “chemoradiation therapy,” “chemoradiotherapy,” “chemoirradiation” and “concurrent chemoradiation therapy (CCRT)” are used interchangeably to refer to a combination of chemotherapy and radiotherapy.

As used herein, the term “a complete response” to CCRT refers to complete remission of tumor without measurable symptoms such as microscopic residual tumor, grossly visible residual tumor, or progression of tumor.

As used herein, the term “primer” refers to a specific oligonucleotide sequence which is complementary to a target nucleotide sequence and used to hybridize to the target nucleotide sequence. A primer serves as an initiation point for nucleotide polymerization catalyzed by either DNA polymerase, RNA polymerase or reverse transcriptase.

As used herein, the term “probe” refers to a defined nucleic acid segment (or nucleotide analog segment, e.g., polynucleotide as defined herein) which can be used to identify a specific polynucleotide sequence present in samples, said nucleic acid segment comprising a nucleotide sequence complementary to the specific polynucleotide sequence to be identified.

In one aspect, the present invention provides a method of predicting response of a patient suffering from esophageal cancer to chemoradiotherapy comprising genotyping a test sample from the patient for a SNP marker selected from the group consisting of rs4954256, rs16863886 and a combination thereof, wherein the presence of a C allele in rs4954256, a G allele in rs16863886 or both is indicative of an increased likelihood of having a complete response to chemoradiotherapy.

Table 1 shows the naturally occurring nucleotide sequences (Homo sapiens) containing rs4954256 (SEQ ID NO: 1) and rs16863886 (SEQ ID NO: 2), obtained from the NCBI's database, wherein the nucleotides within the brackets are the polymorphic nucleotides. The polymorphic nucleotide of rs4954256 is located at position 27 of SEQ ID NO: 1, and the polymorphic nucleotide of rs16863886 is located at position 27 of SEQ ID NO: 2.

TABLE 1
SNP nucleotide sequences

SNP         Nucleotide sequence
rs4954256   atattggagagttaacagagaatgcc[C/T]aaaactggaaaaacaaaaacttcaa  (SEQ ID NO: 1)
rs16863886  aatggtgtcccttgaaggctatctgt[C/T]tgcttttggataaaatggacagaag  (SEQ ID NO: 2)
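The bracket notation in Table 1 can be read programmatically. The following Python sketch, with a hypothetical helper named `parse_snp_context`, splits each sequence into its flanks and alleles and reports the 1-based position of the polymorphic nucleotide, which should agree with position 27 as stated in the text.

```python
import re

# Sequences copied from Table 1.
SEQ_ID_NO_1 = "atattggagagttaacagagaatgcc[C/T]aaaactggaaaaacaaaaacttcaa"
SEQ_ID_NO_2 = "aatggtgtcccttgaaggctatctgt[C/T]tgcttttggataaaatggacagaag"

_PATTERN = re.compile(r"([acgtACGT]*)\[([ACGT])/([ACGT])\]([acgtACGT]*)")


def parse_snp_context(sequence: str) -> dict:
    """Split 'flank[X/Y]flank' notation into flanks, alleles and SNP position."""
    match = _PATTERN.fullmatch(sequence)
    if match is None:
        raise ValueError("expected the form 5'flank[X/Y]3'flank")
    left, allele_1, allele_2, right = match.groups()
    return {
        "left_flank": left,
        "alleles": (allele_1, allele_2),
        "right_flank": right,
        "position": len(left) + 1,  # 1-based position of the polymorphic base
    }


for name, seq in (("rs4954256", SEQ_ID_NO_1), ("rs16863886", SEQ_ID_NO_2)):
    info = parse_snp_context(seq)
    print(name, "alleles", info["alleles"], "position", info["position"])
```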

SNP rs4954256 is located on chromosome 2q21.3 in ZRANB3, which belongs to the SMARCAL1 subfamily. The N-terminal region of ZRANB3 contains a helicase followed by a zinc finger related to the Ran G protein binding proteins. However, the biological function of ZRANB3 is still unclear. SNP rs16863886 is located on chromosome 2q36.1 between SGPP2 and FARSB. FARSB encodes the phenylalanyl-tRNA synthetase beta subunit(s), which are regulatory subunits that form a tetramer with two catalytic alpha subunits. SGPP2 encodes an S1P (sphingosine-1-phosphate)-specific phosphohydrolase, which dephosphorylates S1P into sphingosine. Both SGPP2 and ZRANB3 are involved in G-protein function.

A test sample useful for practicing the method of the invention can be any biological sample of a patient with esophageal cancer that contains nucleic acid molecules, including portions of the gene sequences to be examined. As such, the sample can be a cell, tissue or organ sample, or can be a sample of a biological material such as blood, milk, tears, saliva, hair, skin, tissue, and the like. A nucleic acid sample useful for practicing a method of the invention can be DNA or RNA, particularly genomic DNA or an amplification product thereof. A specific example of a test sample in accordance with the invention is a blood sample.

In one embodiment, the method of the invention is conducted by genotyping a test sample of a patient with esophageal cancer for rs4954256 wherein the presence of a C allele in the SNP marker is indicative of an increased likelihood of having a complete response to chemoradiotherapy.

In another embodiment, the method of the invention is conducted by genotyping a test sample of a patient with esophageal cancer for rs16863886 wherein the presence of a G allele in the SNP marker is indicative of an increased likelihood of having a complete response to chemoradiotherapy.

In yet another embodiment, the method of the invention is conducted by genotyping a test sample of a patient with esophageal cancer for a combination of rs4954256 and rs16863886 wherein the presence of both a C allele in rs4954256 and a G allele in rs16863886 is indicative of an increased likelihood of having a complete response to chemoradiotherapy.

In particular, the patient evaluated for a likelihood of having a complete response to chemoradiotherapy in accordance with the method of the invention is a human adult with esophageal cancer. Typically, the patients are 18 years or older but younger than 70 years old. In certain embodiments, the patients are older than 25, 35, 45 or 55 but younger than 70 years old. In addition, in certain embodiments, the patient as described herein is an Asian patient, particularly a Chinese or Japanese patient. In certain embodiments, the patient is a man.

Numerous methods are known in the art for determining the nucleotide occurrence for a specific SNP in a sample. Such methods can utilize one or more oligonucleotide probes or primers, including, for example, an amplification primer pair that selectively hybridizes to a target polynucleotide, which corresponds to one or more SNP positions. Oligonucleotide probes useful in practicing a method of the invention can include, for example, an oligonucleotide that is complementary to and spans a portion of the target polynucleotide, including the position of the SNP, wherein the presence of a specific nucleotide at the position (i.e., the SNP) is detected by the presence or absence of selective hybridization of the probe. Such a method can further include contacting the target polynucleotide and hybridized oligonucleotide with an endonuclease, and detecting the presence or absence of a cleavage product of the probe, depending on whether the nucleotide occurrence at the SNP site is complementary to the corresponding nucleotide of the probe.

An oligonucleotide ligation assay also can be used to identify a nucleotide occurrence at a polymorphic position, wherein a pair of probes selectively hybridize upstream and adjacent to and downstream and adjacent to the site of the SNP, and wherein one of the probes includes a terminal nucleotide complementary to a nucleotide occurrence of the SNP. Where the terminal nucleotide of the probe is complementary to the nucleotide occurrence, selective hybridization includes the terminal nucleotide such that, in the presence of a ligase, the upstream and downstream oligonucleotides are ligated. As such, the presence or absence of a ligation product is indicative of the nucleotide occurrence at the SNP site. An example of this type of assay is the SNPlex System (Applied Biosystems, Foster City, Calif.).

An oligonucleotide also can be useful as a primer, for example, for a primer extension reaction, wherein the product (or absence of a product) of the extension reaction is indicative of the nucleotide occurrence. In addition, a primer pair useful for amplifying a portion of the target polynucleotide including the SNP site can be useful, wherein the amplification product is examined to determine the nucleotide occurrence at the SNP site. In this regard, useful methods include those that are readily adaptable to a high throughput format, to a multiplex format, or to both. The primer extension or amplification product can be detected directly or indirectly and/or can be sequenced using various methods known in the art. Amplification products which span a SNP can be sequenced using traditional sequencing methodologies such as the “dideoxy-mediated chain termination method,” also known as the “Sanger Method,” and the “chemical degradation method,” also known as the “Maxam-Gilbert method.”

Medium to high-throughput systems for analyzing SNPs known in the art, such as the MassARRAY™ system (Sequenom, San Diego, Calif.), the BeadArray™ SNP genotyping system (San Diego, Calif.), and Affymetrix GeneChip® Human Mapping 500K arrays (Affymetrix, Inc., Santa Clara, Calif.), can be used with the present invention.

The SNP detection methods for practicing the present invention typically utilize selective hybridization. As used herein, the term “selective hybridization” or the like refers to hybridization under moderately stringent or highly stringent conditions so that a nucleotide sequence preferentially associates with a selected nucleotide sequence over unrelated nucleotide sequences to a large enough extent to be useful in identifying a nucleotide occurrence of a SNP. It will be recognized that some amount of non-specific hybridization is unavoidable, but is acceptable provided that hybridization to a target nucleotide sequence is sufficiently selective such that it can be distinguished over the non-specific hybridization, for example, at least about 2-fold more selective, generally at least about 3-fold more selective, usually at least about 5-fold more selective, and particularly at least about 10-fold more selective, as determined, for example, by an amount of labeled oligonucleotide that binds to target nucleic acid molecule as compared to a nucleic acid molecule other than the target molecule, particularly a substantially similar (i.e., homologous) nucleic acid molecule other than the target nucleic acid molecule. Conditions that allow for selective hybridization can be determined empirically, or can be estimated based, for example, on the relative GC:AT content of the hybridizing oligonucleotide and the sequence to which it is to hybridize, the length of the hybridizing oligonucleotide, and the number, if any, of mismatches between the oligonucleotide and sequence to which it is to hybridize (see, for example, Sambrook et al., 2001, Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; and Current Protocols in Molecular Biology (Ausubel et al., ed., J. Wiley & Sons Inc., New York, 1988)).

Generally, stringent conditions are selected to be about 5-30° C. lower than the thermal melting point (T_m) for the specified sequence at a defined ionic strength and pH. Alternatively, stringent conditions are selected to be about 5-15° C. lower than the T_m for the specified sequence at a defined ionic strength and pH. For example, stringent hybridization conditions will be those in which the salt concentration is less than about 1.0 M sodium (or other salts) ion, typically about 0.01 to about 1 M sodium ion concentration at about pH 7.0 to about pH 8.3 and the temperature is at least about 25° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 55° C. for long probes (e.g., greater than 50 nucleotides). An exemplary non-stringent or low stringency condition for a long probe (e.g., greater than 50 nucleotides) would comprise a buffer of 20 mM Tris, pH 8.5, 50 mM KCl, and 2 mM MgCl₂, and a reaction temperature of 25° C.
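As a rough illustration of placing a hybridization temperature relative to the melting point, the following Python sketch estimates the melting temperature with the Wallace rule (2° C. per A or T, 4° C. per G or C) and reports the window 5-30° C. below that estimate. The Wallace rule is a common approximation for short oligonucleotides and is an assumption of this example; it is not a method stated in this description. The function names and the 20-nucleotide example oligonucleotide are hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Approximate melting temperature (deg C) of a short oligo by the Wallace rule."""
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def stringent_window(tm: float, low_offset: float = 30, high_offset: float = 5):
    """Return (lowest, highest) temperature lying 5-30 deg C below the melting point."""
    return tm - low_offset, tm - high_offset


example_oligo = "ATGCATGCATGCATGCATGC"  # hypothetical 20-mer, 50% GC
tm = wallace_tm(example_oligo)
print("estimated Tm:", tm, "stringent window:", stringent_window(tm))
```

Empirical optimization, as the text notes, would still be needed, since the estimate ignores salt concentration, pH and mismatches.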

According to the invention, the method as described herein can be used to determine if a patient with esophageal cancer exhibits an increased likelihood of having a complete response to chemoradiotherapy based on the genotype(s) of rs4954256 and/or rs16863886. As demonstrated in the examples below, each of a C allele in rs4954256 and a G allele in rs16863886 is a protective allele which is more frequently present in a population of patients having a complete response to chemoradiotherapy compared to a population of patients that do not have a complete response to chemoradiotherapy, and therefore the presence of a C allele in rs4954256 or a G allele in rs16863886 indicates that the patient has an increased possibility of having a complete response to chemoradiotherapy. Particularly, a patient exhibits a greater possibility (e.g. at least a 1.5-, 2.0-, 2.5-, 3.0-, 3.5-, 4.0-, 4.5-, or 5.0-fold chance) of having a complete response as the number of C alleles in rs4954256 or G alleles in rs16863886 increases. More particularly, in line with the odds ratios reported in Table 5, a patient exhibits an about 3.84-fold chance of complete chemoradiotherapy response as the number of the C allele in rs4954256 increases, and an about 4.54-fold chance of complete chemoradiotherapy response as the number of the G allele in rs16863886 increases.
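The per-allele effect described above can be sketched in Python as follows. This is a minimal illustration that assumes the additive model of Table 5 is a log-additive (per-allele) model, so that each additional copy of the protective allele multiplies the odds of a complete response by the odds ratio. That interpretation, the function name `odds_multiplier`, and the example genotypes are assumptions of this sketch. The odds ratios are the values listed in Table 5, and the result is a multiplier on odds, not a probability.

```python
import math

# Protective allele and per-allele odds ratio for each marker (values from Table 5).
PER_ALLELE_OR = {
    "rs4954256": ("C", 3.84),
    "rs16863886": ("G", 4.54),
}


def odds_multiplier(snp: str, genotype: str) -> float:
    """Odds multiplier relative to a genotype with no copies of the protective allele."""
    allele, per_allele_or = PER_ALLELE_OR[snp]
    copies = genotype.upper().count(allele)  # 0, 1 or 2 copies
    return per_allele_or ** copies


for snp, genotype in (("rs4954256", "TT"), ("rs4954256", "CT"), ("rs16863886", "GG")):
    print(snp, genotype, round(odds_multiplier(snp, genotype), 2))
```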

Isolated polynucleotides, serving as primers or probes, for example, can be used to perform the genotyping for the SNPs in accordance with the invention, and can readily be determined using the information regarding SNPs and associated nucleic acid sequences provided herein. A number of computer programs such as SeqTool Document v1.0 (IBMS, Taiwan) can be used to rapidly obtain optimal primer/probe sets. In one specific example, a first pair of primers, having SEQ ID NOS: 3 and 4, respectively, is used to perform the genotyping of rs4954256. In another specific example, a second pair of isolated polynucleotides, having SEQ ID NOS: 5 and 6, respectively, is used to perform the genotyping of rs16863886. Table 2 shows the sequences of the primers.

TABLE 2
Primer nucleotide sequences

SNP         Primer          Nucleotide sequence
rs4954256   forward primer  5′-ACGTTGGATGTCTACCGTTTCCCGTATCTC-3′  (SEQ ID NO: 3)
            reverse primer  3′-ACGTTGGATGCCATATTGGAGAGTTAACAG-5′  (SEQ ID NO: 4)
rs16863886  forward primer  5′-ACGTTGGATGCTGCTTAAGGCAATGGTGTC-3′  (SEQ ID NO: 5)
            reverse primer  3′-ACGTTGGATGTTACTTTGGCCCTTCTGTCC-5′  (SEQ ID NO: 6)
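A few simple properties of the Table 2 primers can be checked with the Python sketch below: total length, the leading run of nucleotides that all four primers share, and the GC fraction of the remainder. The helper name `gc_fraction` is hypothetical, the sequences are read in the left-to-right order printed in Table 2, and no claim is made here about the purpose of the shared prefix.

```python
import os

# Primer sequences copied from Table 2, in the order printed there.
PRIMERS = {
    "SEQ ID NO: 3": "ACGTTGGATGTCTACCGTTTCCCGTATCTC",
    "SEQ ID NO: 4": "ACGTTGGATGCCATATTGGAGAGTTAACAG",
    "SEQ ID NO: 5": "ACGTTGGATGCTGCTTAAGGCAATGGTGTC",
    "SEQ ID NO: 6": "ACGTTGGATGTTACTTTGGCCCTTCTGTCC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq.upper()) / len(seq)


# os.path.commonprefix compares strings character by character.
shared_prefix = os.path.commonprefix(list(PRIMERS.values()))
print("shared 5' prefix:", shared_prefix, "length", len(shared_prefix))

for name, seq in PRIMERS.items():
    remainder = seq[len(shared_prefix):]
    print(name, "total length", len(seq),
          "remainder", remainder, f"GC={gc_fraction(remainder):.2f}")
```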

In another aspect, the invention also provides a kit for performing the method as described herein comprising one or more isolated polynucleotides for conducting the genotyping of rs4954256, rs16863886 or a combination thereof. Particularly, the isolated polynucleotides are used as primers or probes for conducting the genotyping of the SNPs in accordance with the invention.

In some embodiments, the kits are PCR kits. In one example, the PCR kit includes the following: (a) primers used to amplify a SNP as described herein; and (b) buffers and enzymes including DNA polymerase.

In some embodiments, the kits are microarray kits. The kits generally comprise probes attached to a solid support surface. The probes may be labeled with a detectable label. In a specific embodiment, the probes are specific for a SNP as described herein. The kits may also comprise hybridization reagents and/or reagents necessary for detecting a signal produced when a probe hybridizes to a target nucleic acid sequence. Generally, the materials and reagents for the microarray kits are in one or more containers. Each component of the kit is generally in its own suitable container.

Primers or probes can readily be designed and synthesized by one of skill in the art for the nucleic acid region of interest. It will be appreciated that suitable primers or probes to be used with the invention can be designed using any suitable method.

A primer or probe is typically at least about 8 nucleotides in length. In one embodiment, a primer or a probe is at least about 10 nucleotides in length. In a specific embodiment, a primer or a probe is at least about 12 nucleotides in length. In another specific embodiment, a primer or probe is at least about 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 nucleotides in length. While the maximal length of a probe can be as long as the target sequence to be detected, depending on the type of assay in which it is employed, it is typically less than about 50, 60, 65, or 70 nucleotides in length. In the case of a primer, it is typically less than about 30 nucleotides in length. In a specific embodiment, a primer or a probe is between about 18 and about 28 nucleotides in length. However, in other embodiments, such as nucleic acid arrays and other embodiments in which probes are affixed to a substrate, the probes can be longer, such as on the order of 30-70, 75, 80, 90, 100, or more nucleotides in length.

In one example, the kit of the invention comprises a first set of isolated polynucleotides for identifying the polymorphic nucleotide at rs4954256. Specifically, the isolated polynucleotides are primers having SEQ ID NOS: 3 and 4, respectively. In another example, the kit of the invention comprises a second set of isolated polynucleotides for identifying the polymorphic nucleotide at rs16863886. Specifically, the isolated polynucleotides are primers having SEQ ID NOS: 5 and 6, respectively. In yet another example, the kit of the invention comprises both a first set and a second set of isolated polynucleotides as described above.

According to the invention, the kit may further contain other agents used to detect the genetic polymorphisms such as (1) reagents for purifying nucleic acids; (2) dNTPs, optionally with one or more uniquely labeled dNTPs; (3) post synthesis labeling reagents, such as chemically active derivatives of fluorescent dyes; (4) enzymes, such as reverse transcriptases, DNA polymerases, and the like; (5) various buffer mediums, e.g., hybridization and washing buffers; (6) labeled probe purification reagents and components, like spin columns, etc.; and (7) signal generation and detection reagents, e.g., streptavidin-alkaline phosphatase conjugate and the like.

In some examples, the kit of the invention further comprises instructions for detecting the SNPs and evaluating the results. Specifically, the instructions describe that the presence of a C allele in rs4954256 or a G allele in rs16863886 as a result of the genotyping is indicative of an increased possibility of having a complete response to chemoradiotherapy. More specifically, in line with the odds ratios reported in Table 5, the instructions describe that a patient evaluated exhibits an about 3.84-fold chance of complete chemoradiotherapy response as the number of the C allele in rs4954256 increases, and an about 4.54-fold chance of complete chemoradiotherapy response as the number of the G allele in rs16863886 increases.

The various embodiments of the present invention are described in detail below. Other characteristics of the present invention will be clearly presented by the following detailed description of the various embodiments and the claims.

EXAMPLE 1 Patient Population and Therapy

This study included ninety (90) ECa patients, males younger than 70 years, who received neoadjuvant CCRT followed by esophagectomy at the National Taiwan University Hospital. Informed consent was obtained from each subject and this study was approved by the institutional review board at National Taiwan University Hospital. Peripheral blood samples were drawn from each patient before surgery and chemoirradiation. Peripheral white blood cells were isolated and stored for further examination.

CCRT was conducted using the cisplatin-based regimen (6 mg/m² on day 1 and day 5 each week) plus 5-fluorouracil (225 mg/m² per day) and/or paclitaxel (35 mg/m² on day 1 and day 4 each week) with concomitant 4000 cGy of irradiation. The neoadjuvant irradiation was delivered using a standard antero-posterior/postero-anterior field technique. Four to six weeks after CCRT, esophagectomy and esophageal reconstruction with gastric or colonic interposition were performed for patients with resectable tumors and acceptable surgical risk according to cardiopulmonary function, nutritional status and general performance status.

Based on the pathologic evaluation of the surgical specimen, patients receiving CCRT were then categorized into two groups: complete responders, for which pathologically complete remission was observed, and poor responders, for which grossly visible residual tumor and/or progression of tumor was observed. Table 3 summarizes the clinical characteristics of the 90 patients.

TABLE 3
Total patients (n = 90)

                       Complete responders   Poor responders
                       (n = 44)              (n = 46)
Age (years)
  mean                 55.20                 54.87
  SD                   7.2836                8.2263
Cigarette smoking
  Yes                  32                    31
  No                   7                     6
  Missing              5                     9
Alcohol consumption
  Yes                  32                    31
  No                   7                     6
  Missing              5                     9
Betel nut chewing
  Yes                  13                    12
  No                   26                    25
  Missing              5                     9

EXAMPLE 2 Two-Stage Genome-Wide Association Study (GWAS)

A two-stage GWAS was performed to identify markers for predicting CCRT response in ECa patients. FIG. 1 shows an overview of the study design.

Stage 1

About 30% of the study population was included in Stage 1 as recommended in Nat Genet 2006; 38:209-13. Therefore, 15 patients were randomly drawn from each of the CCRT complete responders (n=44) and the poor responders (n=46). Table 4 summarizes the clinical characteristics of the 30 patients and those of the remaining 60 patients.
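The Stage 1 draw described above, 15 patients from each response group, can be sketched in Python as a stratified random sample. The patient identifiers, the function name `draw_stage1`, and the fixed seed are hypothetical and serve only to make the example reproducible; the description does not state how the random draw was implemented.

```python
import random


def draw_stage1(complete_ids, poor_ids, per_group=15, seed=None):
    """Randomly draw `per_group` patients from each group; return (stage1, remaining)."""
    rng = random.Random(seed)
    stage1 = rng.sample(complete_ids, per_group) + rng.sample(poor_ids, per_group)
    chosen = set(stage1)
    remaining = [pid for pid in complete_ids + poor_ids if pid not in chosen]
    return stage1, remaining


# Hypothetical identifiers: 44 complete responders and 46 poor responders.
complete = [f"CR{i:02d}" for i in range(1, 45)]
poor = [f"PR{i:02d}" for i in range(1, 47)]

stage1, remaining = draw_stage1(complete, poor, seed=0)
print(len(stage1), "patients in Stage 1;", len(remaining), "remaining")
```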

TABLE 4
                       Patients in Stage 1 (n = 30)   Remaining patients (n = 60)
                       Complete      Poor             Complete      Poor
                       responders    responders       responders    responders
                       (n = 15)      (n = 15)         (n = 29)      (n = 31)
Age (years)
  mean                 55.67         54.6             54.97         55
  SD                   6.2526        7.8944           7.8580        8.5049
Cigarette smoking
  Yes                  11            10               21            21
  No                   2             3                5             3
  Missing              2             2                3             7
Alcohol consumption
  Yes                  11            10               21            21
  No                   2             3                5             3
  Missing              2             2                3             7
Betel nut chewing
  Yes                  4             5                9             7
  No                   9             8                17            17
  Missing              2             2                3             7

Genomic DNA was isolated from blood samples by proteinase K-phenol-chloroform extraction following standard protocols with 0.5% SDS and 200 μg/ml proteinase K. Total genomic DNA (250 ng) was digested with a restriction enzyme (Nsp I or Sty I) and ligated to adaptors that recognize the cohesive four base-pair (bp) overhangs. All fragments resulting from restriction enzyme digestion, regardless of size, were substrates for adaptor ligation. A generic primer that recognizes the adaptor sequence was used to amplify adaptor-ligated DNA fragments. PCR conditions had been optimized to preferentially amplify fragments in the 200 to 1,100 bp size range. The amplified DNA was then fragmented, labeled, and hybridized to GeneChip® Human Mapping 500K arrays (Affymetrix, Inc., Santa Clara, Calif.). After 16 hours of hybridization at 49° C., the arrays were washed by Fluidics Station 450 and scanned by GeneChip Scanner 3000.

Fisher's exact test was used to investigate the association between individual SNPs and the CCRT response. Candidate SNPs were selected based on two criteria (Prostate 2006; 66:1556-64): (1) at least three continuous SNPs with a p-value<0.001 lay in a confined genomic region; and (2) the SNP was the most significant SNP within this region. Twenty-six (26) candidate SNPs were obtained in Stage 1.
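The two selection criteria above can be sketched in Python as a scan over SNPs ordered by genomic position. This sketch assumes that "continuous" means adjacent in position order and treats each unbroken run of SNPs with p<0.001 as one "confined genomic region"; the description does not define the region more precisely. The function name `select_candidates` and the example data are hypothetical.

```python
def select_candidates(snps, threshold=0.001, min_run=3):
    """Pick the most significant SNP from each run of >= min_run adjacent SNPs
    whose p-values are all below the threshold.

    `snps` is a sequence of (snp_id, p_value) pairs in genomic order on one chromosome.
    """
    candidates = []
    run = []
    # A sentinel with p = 1.0 closes any run that reaches the end of the list.
    for snp_id, p_value in list(snps) + [(None, 1.0)]:
        if p_value < threshold:
            run.append((snp_id, p_value))
            continue
        if len(run) >= min_run:
            candidates.append(min(run, key=lambda item: item[1])[0])
        run = []
    return candidates


# Hypothetical p-values for seven adjacent SNPs.
example = [
    ("snp1", 0.5), ("snp2", 0.0005), ("snp3", 0.0001), ("snp4", 0.0009),
    ("snp5", 0.2), ("snp6", 0.0002), ("snp7", 0.0003),
]
print(select_candidates(example))  # only the first run has three SNPs
```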

Stage 2

Genotypes of the 26 candidate SNPs were further verified for all 90 patients using the MassARRAY system from Sequenom (San Diego, USA) according to the iPLEX protocol. The assay was based upon the annealing of a primer adjacent to the polymorphic site of interest. PCR primers and extension primers were designed using the software SeqTool Document v1.0 (IBMS, Taiwan). The addition of a DNA polymerase, plus a cocktail mixture of nucleotides and terminators, allowed extension of the primer through the polymorphic site, and generated a unique mass product. The resultant mass of the primer extension product was then analyzed by using the MassARRAY TyperAnalyzer v3.3 software (Sequenom) to determine the sequence of the nucleotides at the polymorphic site. The primers were designed for two SNPs, rs4954256 and rs16863886, for PCR amplification as follows: 5′-ACGTTGGATGTCTACCGTTTCCCGTATCTC-3′ (SEQ ID NO: 3) and 3′-ACGTTGGATGCCATATTGGAGAGTTAACAG-5′ (SEQ ID NO: 4) for rs4954256; 5′-ACGTTGGATGCTGCTTAAGGCAATGGTGTC-3′ (SEQ ID NO: 5) and 3′-ACGTTGGATGTTACTTTGGCCCTTCTGTCC-5′ (SEQ ID NO: 6) for rs16863886. For these two SNPs, the following PCR conditions were used: 0.5 mM of each primer, 200 mM dNTP, 2.5 units of Taq polymerase, a standard polymerase buffer supplied with enzyme (1.5 mM MgCl₂), and 150 ng of genomic DNA. The total volume of the PCR mix was 25 ml. The PCR temperature program was: 95° C. denaturation for 5 min; 35 cycles of 1 min each at 95° C., 1.75 min at 55° C., and 1.75 min at 72° C.; and a final extension run at 72° C. for 10 min. The PCR products were run on a 6% agarose gel at 50 W for 30 min.
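As a small worked example, the nominal duration of the PCR temperature program above can be added up in Python. The function name `pcr_program_minutes` is hypothetical, and the total ignores temperature ramping and any hold steps, which are not stated in the description.

```python
def pcr_program_minutes(initial_denaturation, cycles, cycle_steps, final_extension):
    """Sum the nominal step times (in minutes) of a PCR temperature program."""
    return initial_denaturation + cycles * sum(cycle_steps) + final_extension


# 95 C for 5 min; 35 cycles of 1 min, 1.75 min and 1.75 min; 72 C for 10 min.
total_minutes = pcr_program_minutes(5, 35, (1, 1.75, 1.75), 10)
print(total_minutes, "minutes, or about", round(total_minutes / 60, 2), "hours")
```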

Classification of SNPs was determined manually using the MassARRAY TyperAnalyzer v3.3 software (Sequenom, San Diego, USA). Independent Fisher's exact tests were performed for the remaining 60 samples (those in addition to the 30 samples in Stage 1). No significant differences were observed between these two populations (data not shown). Therefore, data from all 90 samples were pooled to gain higher statistical power in Stage 2. The mean ages were not significantly different between complete and poor responders in either Stage 1 or Stage 2 (data not shown). In addition, none of the clinical characteristics, including cigarette smoking, alcohol consumption, and betel nut chewing, were significantly different between complete and poor responders, and thus they were not included in further analyses (data not shown). Genotypes of the 30 patients in Stage 1 were verified by Sequenom data, and 8 SNPs with a high replication error or a low call rate were excluded from further analysis, leaving 18 candidate SNPs. The Stage 2 samples consisted of the 30 Stage 1 patients and the remaining 60 patients.

Further, a Bonferroni correction was used in Stage 2 to address the issue of multiple testing. After correction, results of the additive models showed that two SNPs, rs4954256 and rs16863886, were significantly associated with the CCRT response (rs4954256: OR=3.84, 95% CI=1.56-9.43, p-value=0.002; rs16863886: OR=4.54, 95% CI=1.81-11.40, p-value=9×10⁻⁴). Table 5 lists statistical data for the 18 candidate SNPs in Stages 1 and 2.
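
The effect of the correction can be illustrated with the Stage 2 p-values listed in Table 5. The text does not state the family-wise significance level or the number of tests used for the correction; the sketch below assumes a level of 0.05 divided over the 18 candidate SNPs, and the function name is illustrative.

```python
# Stage 2 p-values as listed in Table 5 (18 candidate SNPs).
STAGE2_P = {
    "rs12713098": 0.092, "rs4954256": 0.002, "rs16863886": 9e-4,
    "rs4284824": 0.062, "rs4697204": 0.032, "rs1876266": 0.012,
    "rs1440971": 0.093, "rs1630140": 0.010, "rs1805740": 0.224,
    "rs4240039": 0.057, "rs4830776": 0.028, "rs5937044": 0.868,
    "rs927142": 1.0, "rs5990542": 0.226, "rs5910842": 0.010,
    "rs10521750": 0.015, "rs5951775": 0.095, "rs1202918": 0.035,
}

def bonferroni(p_values, alpha=0.05):
    """Return the per-test cutoff (alpha / number of tests) and the SNPs below it.

    alpha = 0.05 is an assumption; the source does not state the level used.
    """
    cutoff = alpha / len(p_values)
    hits = sorted(snp for snp, p in p_values.items() if p < cutoff)
    return cutoff, hits

cutoff, hits = bonferroni(STAGE2_P)
print(round(cutoff, 5), hits)
```

Under these assumptions the per-test cutoff is about 0.0028, and only rs4954256 and rs16863886 fall below it, which agrees with the two SNPs reported as significant after correction.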

TABLE 5

                                          Stage 1*      Stage 2** (n = 90)
                                          (n = 30)      MAF^(§)     MAF^(§)
SNP         Location  Gene        Minor   p-value^(§§)  Complete    Poor        p-value^(§§)  OR (95% CI)
                                  allele                responders  responders
rs12713098  2p16.3    XRXN1       A       7.64 × 10⁻⁵   0.500       0.3667      0.092         1.66 (0.87-3.17)
rs4954256   2q21.3    ZRANB3      C       3.85 × 10⁻⁵   0.2841      0.0930      0.002         3.84 (1.56-9.43)
rs16863886  2q36.1    intergenic  G       4.75 × 10⁻⁷   0.2841      0.0870      9 × 10⁻⁴      4.54 (1.81-11.40)
rs4284824   2q37.1    INPP5D      C       5.35 × 10⁻⁵   0.3182      0.4651      0.062         0.54 (0.29-1.01)
rs4697204   4p15.31   intergenic  G       1.02 × 10⁻⁴   0.4886      0.3256      0.032         1.98 (1.05-3.74)
rs1876266   4p16.1    intergenic  V       1.70 × 10⁻⁴   0.3068      0.1413      0.012         2.88 (1.30-6.37)
rs1440971   7q32.1    intergenic  C       1.73 × 10⁻⁵   0.2045      0.1047      0.093         1.99 (0.88-4.54)
rs1630140   9q22.2    intergenic  C       1.31 × 10⁻⁴   0.1364      0.3023      0.010         0.32 (0.14-0.74)
rs1805740   12p31.13  PHC1        C       6.31 × 10⁻⁴   0.2955      0.2093      0.224         1.53 (0.78-3.00)
rs4240039   Xp11.4    intergenic  G       7.97 × 10⁻⁴   0.1364      0.2558      0.057         0.68 (0.39-1.17)
rs4830776   Xp22.2    intergenic  C       1.23 × 10⁻⁴   0.2045      0.3571      0.028         0.68 (0.42-1.10)
rs5937044   Xq13.1    intergenic  A       1.61 × 10⁻⁵   0.2954      0.2791      0.868         1.04 (0.65-1.66)
rs927142    Xq21.31   intergenic  G       1.55 × 10⁻⁵   0.5         0.444       1             1.12 (0.73-1.85)
rs5990542   Xq21.33   intergenic  C       1.31 × 10⁻⁴   0.5227      0.4286      0.226         1.21 (0.79-1.85)
rs5910842   Xq24      intergenic  A       3.56 × 10⁻⁷   0.5         0.3043      0.010         1.41 (0.92-2.15)
rs10521750  Xq25      intergenic  C       1.91 × 10⁻⁶   0.5682      0.3810      0.015         1.46 (0.95-2.25)
rs5951775   Xq27.3    intergenic  T       1.68 × 10⁻⁵   0.1591      0.0698      0.095         1.59 (0.78-3.24)
rs1202918   Xq28      intergenic  A       4.53 × 10⁻⁵   0.5114      0.3478      0.035         1.41 (0.92-2.15)

^(§)MAF denotes the minor allele frequency.
^(§§)p-values were obtained from Fisher's exact test.
*p-values were calculated based on data from 30 patients using microarrays.
**Results in Stage 2 were calculated based on all 90 samples examined by mass spectrometry.
OR denotes odds ratio and was calculated using the additive model.
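
The p-values in Table 5 were obtained from Fisher's exact test. The source does not give the underlying count tables, nor does it say whether the test was applied to allele or genotype counts, so the following is only a minimal sketch of a two-sided Fisher's exact test on a 2×2 table (for example, minor versus major allele counts in complete versus poor responders). The function name and the counts used are hypothetical.

```python
from math import comb

def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical counts: rows = responder groups, columns = minor/major allele.
p = fisher_exact_2x2(1, 9, 11, 3)
print(p)
```

The test is exact in the sense that it makes no large-sample approximation, which suits the small Stage 1 group of 30 patients.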

In addition to Fisher's exact test, the Cochran-Armitage trend test was performed to confirm that rs4954256 (p-value=0.002) and rs16863886 (p-value=6×10⁻⁴) were significantly associated with the CCRT response as the number of minor alleles increased. SNP rs16863886 was significantly associated with 4.54-fold higher odds of a complete CCRT response [95% confidence interval (CI)=1.81-11.40] as the number of minor alleles increased, and SNP rs4954256 was associated with 3.84-fold higher odds of a complete CCRT response (95% CI=1.56-9.43). In addition, based on the additive model, the Leave-One-Out Cross Validation (LOOCV) accuracy for predicting the CCRT response was 64.37% for rs4954256 (56 out of 87 patients) and 68.89% for rs16863886 (62 out of 90 patients). Combining both SNPs increased the prediction accuracy to 72.41% (63 out of 87 patients). The sensitivity (correctly predicting the nonresponders) and specificity (correctly predicting the responders) were 70% and 75%, respectively. The positive predictive value was 71% and the negative predictive value was 73%. The regression coefficients of rs4954256 and rs16863886 were 1.3572 and 1.5745, respectively, indicating that the probability of a complete CCRT response increased as the number of minor alleles increased. These results demonstrated that rs4954256 and rs16863886 were strongly associated with the CCRT response in ECa patients and can be utilized in predicting CCRT responses.
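
The trend test referred to above can be sketched as follows. This is an illustration under stated assumptions, not the computation used in the study: genotype counts are not given in the text, so the counts below are hypothetical; the scores 0, 1, 2 stand for the number of minor alleles; and the p-value uses the usual normal approximation. The function name is illustrative.

```python
from math import erfc, sqrt

def cochran_armitage(complete, poor, scores=(0, 1, 2)):
    """Cochran-Armitage trend test across ordered genotype groups.

    complete[i] and poor[i] are the numbers of complete and poor responders
    carrying scores[i] minor alleles. Returns (z, two-sided p-value).
    """
    n = [r + s for r, s in zip(complete, poor)]
    total, responders = sum(n), sum(complete)
    # Score-weighted excess of observed over expected complete responders.
    u = sum(t * (r - ni * responders / total)
            for t, r, ni in zip(scores, complete, n))
    s1 = sum(t * ni for t, ni in zip(scores, n))
    s2 = sum(t * t * ni for t, ni in zip(scores, n))
    var = responders * (total - responders) / total**3 * (total * s2 - s1**2)
    if var == 0:
        raise ValueError("trend statistic is undefined for this table")
    z = u / sqrt(var)
    return z, erfc(abs(z) / sqrt(2))

# Hypothetical counts with 0, 1, and 2 minor alleles.
z, p = cochran_armitage(complete=(5, 10, 15), poor=(15, 10, 5))
print(z, p)
```

A positive z means that the proportion of complete responders rises with the number of minor alleles, which is the direction reported for both SNPs.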

This is the first two-stage GWAS to identify SNPs with a high accuracy for predicting the CCRT response in treating ECa. Two SNPs, rs4954256 and rs16863886, were found to be significantly associated with a complete CCRT response (as the number of minor alleles increased) and provided a high level of prediction accuracy (72.41%). Such germline polymorphisms, determined from blood samples, do not change over time. They are very stable, in contrast to somatic mutations obtained from tumor tissue, which change as the disease progresses. The use of the two SNPs according to the invention is helpful in predicting an ECa patient's response to CCRT and then determining the most appropriate treatment for the patient based on the result of the prediction.

1. A method of predicting an increased likelihood of response of a human patient with esophageal cancer to radiochemotherapy and subsequent esophagectomy, wherein the radiochemotherapy comprises radiation in conjunction with cisplatin, 5-fluorouracil and/or paclitaxel, comprising: obtaining a sample from the human patient with esophageal cancer; detecting in the sample the presence of a guanine at the polymorphic position of rs16863886; and predicting that the human patient with the guanine at the polymorphic position of rs16863886 has an increased likelihood of responding to radiochemotherapy and subsequent esophagectomy compared with a subject without the guanine at the polymorphic position of rs16863886.
2. The method of claim 1, further comprising providing a pair of primers having the sequences of SEQ ID NOS: 5 and 6, respectively, to detect the guanine at the polymorphic position of rs16863886.